{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Usage With Qdrant\n",
    "\n",
    "This notebook demonstrates how to use FastEmbed and Qdrant to perform vector search and retrieval. Qdrant is an open-source vector similarity search engine that is used to store, organize, and query collections of high-dimensional vectors. \n",
    "\n",
    "We will use the Qdrant to add a collection of documents to the engine and then query the collection to retrieve the most relevant documents.\n",
    "\n",
    "It consists of the following sections:\n",
    "\n",
    "1. Setup: Installing necessary packages, including the Qdrant Client and FastEmbed.\n",
    "2. Importing Libraries: Importing FastEmbed and other libraries\n",
    "3. Data Preparation: Example data and embedding generation\n",
    "4. Querying: Defining a function to search documents based on a query\n",
    "5. Running Queries: Running example queries\n",
    "\n",
    "## Setup\n",
    "\n",
    "First, we need to install the dependencies. `fastembed` to create embeddings and perform retrieval, and `qdrant-client` to interact with the Qdrant database."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install 'qdrant-client[fastembed]' --quiet --upgrade"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Importing the necessary libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from qdrant_client import QdrantClient"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Preparation\n",
    "We initialize the embedding model and generate embeddings for the documents.\n",
    "\n",
    "### 💡 Tip: Prefer using `query_embed` for queries and `passage_embed` for documents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example list of documents\n",
    "documents: list[str] = [\n",
    "    \"Maharana Pratap was a Rajput warrior king from Mewar\",\n",
    "    \"He fought against the Mughal Empire led by Akbar\",\n",
    "    \"The Battle of Haldighati in 1576 was his most famous battle\",\n",
    "    \"He refused to submit to Akbar and continued guerrilla warfare\",\n",
    "    \"His capital was Chittorgarh, which he lost to the Mughals\",\n",
    "    \"He died in 1597 at the age of 57\",\n",
    "    \"Maharana Pratap is considered a symbol of Rajput resistance against foreign rule\",\n",
    "    \"His legacy is celebrated in Rajasthan through festivals and monuments\",\n",
    "    \"He had 11 wives and 17 sons, including Amar Singh I who succeeded him as ruler of Mewar\",\n",
    "    \"His life has been depicted in various films, TV shows, and books\",\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This tutorial demonstrates how to utilize the QdrantClient to add documents to a collection and query the collection for relevant documents."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ➕ Adding Documents\n",
    "\n",
    "The `add` creates a collection if it does not already exist. Now, we can add the documents to the collection:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 77.7M/77.7M [00:05<00:00, 14.6MiB/s]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "['4fa8b10c78da4b18ba0830ba8a57367a',\n",
       " '2eae04b515ee4e9185a9a0e6be812bba',\n",
       " 'c6039f88486f47f1835ae3b069c5823c',\n",
       " 'c2c8c51e305144d1917b373125fb4d95',\n",
       " '79fd23b9ec0648cdab38d1947c6b933e',\n",
       " '036aa200d8c3492b8a438e4f825f5e7f',\n",
       " 'c35c77f3ea37460a9a13723fb77b7367',\n",
       " '6ebccbca571b40d0ab6e83e5e0f2f562',\n",
       " '38048c2ccc1d4962a4f8f1bd89c8357a',\n",
       " 'c6b09308360140c7b4f106af3658a31e']"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "client = QdrantClient(\":memory:\")\n",
    "client.add(collection_name=\"test_collection\", documents=documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These are the ids of the documents we just added. We don't have a use for them in this tutorial, but they can be used to update or delete documents."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 📝 Running Queries\n",
    "We'll define a function to print the top k documents based on a query, and prepare a sample query."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[42, 2]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Prepare your documents, metadata, and IDs\n",
    "docs = [\"Qdrant has Langchain integrations\", \"Qdrant also has Llama Index integrations\"]\n",
    "metadata = [\n",
    "    {\"source\": \"Langchain-docs\"},\n",
    "    {\"source\": \"Linkedin-docs\"},\n",
    "]\n",
    "ids = [42, 2]\n",
    "\n",
    "# Use the new add method\n",
    "client.add(collection_name=\"demo_collection\", documents=docs, metadata=metadata, ids=ids)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Behind the scenes, Qdrant Client uses the FastEmbed library to make a passage embedding and then uses the Qdrant API to upsert the documents with metadata, put together as a Points into the collection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[QueryResponse(id=42, embedding=None, metadata={'document': 'Qdrant has Langchain integrations', 'source': 'Langchain-docs'}, document='Qdrant has Langchain integrations', score=0.8276550115796268), QueryResponse(id=2, embedding=None, metadata={'document': 'Qdrant also has Llama Index integrations', 'source': 'Linkedin-docs'}, document='Qdrant also has Llama Index integrations', score=0.8265536935180283)]\n"
     ]
    }
   ],
   "source": [
    "search_result = client.query(\n",
    "    collection_name=\"demo_collection\", query_text=\"This is a query document\"\n",
    ")\n",
    "print(search_result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 🎬 Conclusion\n",
    "\n",
    "This tutorial demonstrates the basics of working with the QdrantClient to add and query documents. By following this guide, you can easily integrate Qdrant into your projects for vector similarity search and retrieval.\n",
    "\n",
    "Remember to properly handle the closing of the client connection and further customization of the query parameters according to your specific needs.\n",
    "\n",
    "The official Qdrant Python client documentation can be found [here](https://github.com/qdrant/qdrant-client) for more details on customization and advanced features."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "fst",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
